Instantaneous PSD Estimation for Speech Enhancement based on Generalized Principal Components
Power spectral density (PSD) estimates of various microphone signal
components are essential to many speech enhancement procedures. As speech is
highly nonstationary, performance improvements may be gained by maintaining
time-variations in PSD estimates. In this paper, we propose an instantaneous
PSD estimation approach based on generalized principal components. Similarly to
other eigenspace-based PSD estimation approaches, we rely on recursive
averaging in order to obtain a microphone signal correlation matrix estimate to
be decomposed. However, instead of estimating the PSDs directly from the
temporally smooth generalized eigenvalues of this matrix, yielding temporally
smooth PSD estimates, we propose to estimate the PSDs from newly defined
instantaneous generalized eigenvalues, yielding instantaneous PSD estimates.
The instantaneous generalized eigenvalues are defined from the generalized
principal components, i.e. a generalized eigenvector-based transform of the
microphone signals. We further show that the smooth generalized eigenvalues can
be understood as a recursive average of the instantaneous generalized
eigenvalues. Simulation results comparing the multi-channel Wiener filter (MWF)
with smooth and instantaneous PSD estimates indicate better speech enhancement
performance for the latter. A MATLAB implementation is available online.
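The relation between smooth and instantaneous generalized eigenvalues described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact definition, so the instantaneous values are assumed here to be the squared magnitudes of the generalized principal components, a batch average stands in for the recursive average, and the noise correlation matrix `R_v`, the sizes `M` and `T`, and the helper `generalized_eig` are hypothetical.

```python
import numpy as np

def generalized_eig(R_y, R_v):
    """Solve R_y p = lambda R_v p with P^H R_v P = I, eigenvalues descending."""
    L = np.linalg.cholesky(R_v)              # R_v = L L^H
    L_inv = np.linalg.inv(L)
    A = L_inv @ R_y @ L_inv.conj().T         # whitened correlation matrix
    A = (A + A.conj().T) / 2                 # remove rounding asymmetry
    lam, W = np.linalg.eigh(A)               # ascending eigenvalues
    order = np.argsort(lam)[::-1]
    P = L_inv.conj().T @ W[:, order]         # generalized eigenvectors
    return lam[order], P

rng = np.random.default_rng(0)
M, T = 4, 200                                # microphones, frames (hypothetical)
Y = rng.standard_normal((M, T)) + 1j * rng.standard_normal((M, T))
R_v = np.eye(M) + 0.1 * np.ones((M, M))      # assumed known noise correlation
R_y = (Y @ Y.conj().T) / T                   # batch average (stand-in for recursive averaging)

lam_smooth, P = generalized_eig(R_y, R_v)    # temporally smooth eigenvalues
Z = P.conj().T @ Y                           # generalized principal components per frame
lam_inst = np.abs(Z) ** 2                    # assumed instantaneous eigenvalues, shape (M, T)
```

Under these assumptions, averaging `lam_inst` over frames recovers `lam_smooth`, which mirrors the abstract's statement that the smooth eigenvalues are an average of the instantaneous ones.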
On the Convergence of the Multipole Expansion Method
The multipole expansion method (MEM) is a spatial discretization technique
that is widely used in applications that feature scattering of waves from
circular cylinders. Moreover, it also serves as a key component in several
other numerical methods in which scattering computations involving arbitrarily
shaped objects are accelerated by enclosing the objects in artificial
cylinders. A fundamental question is that of how fast the approximation error
of the MEM converges to zero as the truncation number goes to infinity. Despite
the fact that the MEM was introduced in 1913, and has been in widespread usage
as a numerical technique since as far back as 1955, to the best of the authors'
knowledge, a precise characterization of the asymptotic rate of convergence of
the MEM has not been obtained. In this work, we provide a resolution to this
issue. While our focus in this paper is on the Dirichlet scattering problem,
this is merely for convenience and our results actually establish convergence
rates that hold for all MEM formulations irrespective of the specific boundary
conditions or boundary integral equation solution representation chosen.
Comment: 21 pages, 2 figures; Corrected a scaling error that occurred when
plotting the third columns of Figs 1,2,3, some very minor grammatical edits
to the intro/conclusion to improve clarity and conciseness, included funding
info in first page; updated intro with historical info; reformatted several
sections to reduce no. of pages; changed title, shortened abstract; fixed
typo in proof of Thm 1.
Low-Complexity Steered Response Power Mapping based on Nyquist-Shannon Sampling
The steered response power (SRP) approach to acoustic source localization
computes a map of the acoustic scene from the frequency-weighted output power
of a beamformer steered towards a set of candidate locations. Equivalently, SRP
may be expressed in terms of time-domain generalized cross-correlations (GCCs)
at lags equal to the candidate locations' time-differences of arrival (TDOAs).
Due to the dense grid of candidate locations, each of which requires inverse
Fourier transform (IFT) evaluations, conventional SRP exhibits a high
computational complexity. In this paper, we propose a low-complexity SRP
approach based on Nyquist-Shannon sampling. Noting that on the one hand the
range of possible TDOAs is physically bounded, while on the other hand the GCCs
are bandlimited, we critically sample the GCCs around their TDOA interval and
approximate the SRP map by interpolation. In usual setups, the number of sample
points can be orders of magnitude less than the number of candidate locations
and frequency bins, yielding a significant reduction of IFT computations at a
limited interpolation cost. Simulations comparing the proposed approximation
with conventional SRP indicate low approximation errors and equal localization
performance. MATLAB and Python implementations are available online.
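The sampling-and-interpolation idea above can be sketched as follows. This is an illustration under stated assumptions, not the published implementation: the GCC is a toy one (flat cross-spectrum with a pure delay), the interpolator is a truncated sinc sum, and `fs`, `tau_max`, `tau0`, and `margin` are hypothetical values.

```python
import numpy as np

fs = 16000.0                 # sampling rate in Hz (hypothetical)
tau_max = 0.5e-3             # largest physically possible TDOA in s (hypothetical)
f_max = fs / 2               # bandlimit of the GCC
T_s = 1.0 / (2 * f_max)      # critical (Nyquist) sampling period

def gcc(tau, tau0=0.2e-3):
    """Toy bandlimited GCC: flat cross-spectrum with a pure delay tau0."""
    return np.sinc((tau - tau0) / T_s)

margin = 10                  # extra samples per side to limit truncation error (assumption)
n_max = int(np.ceil(tau_max / T_s)) + margin
n = np.arange(-n_max, n_max + 1)
samples = gcc(n * T_s)       # the only GCC evaluations that are needed

# Dense grid of candidate TDOAs, filled in by truncated sinc interpolation.
tau_cand = np.linspace(-tau_max, tau_max, 2001)
interp = np.sinc(tau_cand[:, None] / T_s - n[None, :]) @ samples

max_err = np.max(np.abs(interp - gcc(tau_cand)))
```

The GCC is evaluated only at `len(samples)` lags, far fewer than the 2001 candidate TDOAs, and `max_err` measures what the truncated interpolation costs in accuracy for this toy case.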
Detection and restoration of click degraded audio based on high-order sparse linear prediction
Clicks are short-duration defects that affect most archived audio media. Linear prediction (LP) modeling for the representation and restoration of audio signals that have been corrupted by click degradation has been extensively studied. The use of high-order sparse linear prediction for the restoration of click-degraded audio given the time location of samples affected by click degradation has been shown to lead to significant restoration improvement over conventional LP-based approaches. For the practical usage of such methods, the identification of the time location of samples affected by click degradation is critical. High-order sparse linear prediction has been shown to lead to better modeling of audio, resulting in better restoration of click-degraded archived audio. In this paper, the use of high-order sparse linear prediction for the detection and restoration of click-degraded audio is proposed. Results in terms of click duration estimation, SNR improvement and perceptual audio quality show that the proposed approach based on high-order sparse linear prediction leads to better performance compared to state-of-the-art LP-based approaches.
Sampling Rate Offset Estimation and Compensation for Distributed Adaptive Node-Specific Signal Estimation in Wireless Acoustic Sensor Networks
Sampling rate offsets (SROs) between devices in a heterogeneous wireless
acoustic sensor network (WASN) can hinder the ability of distributed adaptive
algorithms to perform as intended when they rely on coherent signal processing.
In this paper, we present an SRO estimation and compensation method to allow
the deployment of the distributed adaptive node-specific signal estimation
(DANSE) algorithm in WASNs composed of asynchronous devices. The signals
available at each node are first utilised in a coherence-drift-based method to
blindly estimate SROs which are then compensated for via phase shifts in the
frequency domain. A modification of the weighted overlap-add (WOLA)
implementation of DANSE is introduced to account for SRO-induced full-sample
drifts, permitting per-sample signal transmission via an approximation of the
WOLA process as a time-domain convolution. The performance of the proposed
algorithm is evaluated in the context of distributed noise reduction for the
estimation of a target speech signal in an asynchronous WASN.
Comment: 9 pages, 6 figures
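The compensation step described above, phase shifts in the frequency domain, can be sketched as follows. The sketch covers compensation only and assumes the SRO is already known; the coherence-drift-based estimation and the WOLA modification are not reproduced. The function `compensate_delay`, the first-order drift model, the sign convention, and all numeric values are assumptions made for illustration.

```python
import numpy as np

def compensate_delay(frame, delay_samples):
    """Advance a frame by a (possibly fractional) delay via DFT phase shifts."""
    N = len(frame)
    k = np.arange(N // 2 + 1)                        # non-negative frequency bins
    spectrum = np.fft.rfft(frame)
    spectrum *= np.exp(2j * np.pi * k * delay_samples / N)
    return np.fft.irfft(spectrum, n=N)

N = 512                                              # frame length (hypothetical)
sro_ppm = 100.0                                      # assumed known SRO in ppm
frame_index, hop = 40, 256                           # hypothetical frame position and hop size
drift = sro_ppm * 1e-6 * frame_index * hop           # first-order accumulated drift in samples

n = np.arange(N)
reference = np.cos(2 * np.pi * 5 * n / N)            # signal as seen by a synchronous node
drifted = np.cos(2 * np.pi * 5 * (n - drift) / N)    # same signal after SRO-induced drift
restored = compensate_delay(drifted, drift)
```

The test signal is periodic within the frame, so the circular shift implied by the DFT is exact here; for general signals the phase shift is only an approximation near the frame edges.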
Regularized Adaptive Notch Filters for Acoustic Howling Suppression
Publication in the conference proceedings of EUSIPCO, Glasgow, Scotland, 200